Abstract

Fix $\alpha > 0$, and sample $N$ integers uniformly at random
from $\{1, 2, \ldots, \lfloor e^{\alpha N} \rfloor\}$. Given $\eta > 0$, the probability that the maximum of
the pairwise GCDs lies between $N^{2-\eta}$ and $N^{2+\eta}$ converges to 1 as $N \to \infty$.
More precise estimates are obtained. This is a Birthday Problem: two
of the random integers are likely to share some prime factor of order
$N^2 / \log[N]$. The proof generalizes to any arithmetical semigroup where
a suitable form of the Prime Number Theorem is valid.



1. Main Result 

Whereas the distribution of the sizes of the prime divisors of a random integer
is a well studied subject (see portions of Billingsley (1999)), the authors are
unaware of any published results on the pairwise Greatest Common Divisors
(GCDs) among a large collection of random integers. Theorem 1.1 establishes
probabilistic upper and lower bounds for the maximum of these pairwise GCDs.

1.1 Theorem 

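Suppose $\alpha > 0$, and $T_1, \ldots, T_N$ is a random sample, drawn with replacement,
from the integers $\{n \in \mathbb{N} : n \le e^{\alpha N}\}$. Let $\Gamma_{j,k}$ denote the Greatest Common
Divisor of $T_j$ and $T_k$. For any $\eta > 0$,

\[ \lim_{N \to \infty} \mathbb{P}\left[ N^{2-\eta} < \max_{1 \le j < k \le N} \{\Gamma_{j,k}\} < N^{2+\eta} \right] = 1. \tag{1} \]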



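Indeed there are more precise estimates: for all $s \in (0, 1)$, and $b > 0$, the right
side of (2) is finite, and

\[ \mathbb{P}\left[ \max_{1 \le j < k \le N} \{\Gamma_{j,k}\} \ge N^{2/s} b^{1/s} \right] \le \frac{1}{2b} \prod_{p \in \mathcal{P}} \left( 1 + \frac{p^s - 1}{p^2 - p^s} \right), \tag{2} \]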



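where $\mathcal{P}$ denotes the rational primes; while if $\Lambda_{j,k}$ denotes the largest common
prime factor of $T_j$ and $T_k$, then for all $\theta > 0$,

\[ \lim_{N \to \infty} \mathbb{P}\left[ \max_{1 \le k < j \le N} \{\Lambda_{j,k}\} < \frac{N^2}{\log\left[N^{\theta}\right]} \right] \le e^{-\theta/8}. \tag{3} \]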



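Supplement: There is an upper bound, similar to (2), for the radical (i.e.
the largest square-free divisor) $\mathrm{rad}[\Gamma_{j,k}]$ of the GCD:

\[ \mathbb{P}\left[ \max_{1 \le j < k \le N} \{\mathrm{rad}[\Gamma_{j,k}]\} \ge N^{2/s} b^{1/s} \right] \le \frac{1}{2b} \prod_{p \in \mathcal{P}} \left( 1 + p^{-2}\left(p^s - 1\right) \right). \tag{4} \]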



The proof, which is omitted, uses methods similar to those of Proposition 2.2, 
based upon a Bernoulli model for occurrence of prime divisors, instead of a 
Geometric model for prime divisor multiplicities. For example, when s = 0.999, 
the product on the right side of (4) is approximately 12.44; for the right side of 
(2), it is approximately 17.64. 



1.2 Overview of the Proof of Theorem 1.1 

Let $Z_i^k$ be a Bernoulli random variable, which takes the value 1 when prime
$p_i$ divides $T_k$. As a first step towards the proof, imagine proving a comparable
result in the case where $\{Z_i^k, 1 \le k \le N, i \ge 1\}$ were independent, and
$\mathbb{P}[Z_i^k = 1] = 1/p_i$. The harder parts of the proof arise in dealing with the
reality that, for fixed $k$, $\{Z_i^k, i \ge 1\}$ are negatively associated, and change with
$N$. Convergence of the series

\[ \sum_{p \in \mathcal{P}} p^{-2} \log[p] < \infty \]

ensures that the parameter $\alpha$, which governs the range of integers being sampled,
appears in none of (1), (2), and (3). However the proof for the lower bound
depends crucially on an exponential (in $N$) rate of growth in the range, in order
to moderate the dependence among $\{Z_i^k, i \ge 1\}$ for fixed $k$.

Consider primes as labels on a set of urns; the random variable $T_j$ contributes
a ball to the urn labelled $p$ if prime $p$ divides $T_j$. The lower bound comes from
showing that, with asymptotic probability at least $1 - e^{-\theta/8}$, some urn with
a label $p > N^2 / \log[N^{\theta}]$ contains more than one ball; in that case prime $p$ is
a common divisor of two distinct members of the list $T_1, \ldots, T_N$. The upper
bound comes from an exponential moment inequality.
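The statement of Theorem 1.1 can be explored empirically. The following is a minimal Monte Carlo sketch, not part of the proof: it draws $N$ integers uniformly from $\{1, \ldots, \lfloor e^{\alpha N} \rfloor\}$ and reports $\log(\max \mathrm{GCD}) / \log N$, which (1) predicts to be close to 2 for large $N$. The function names `max_pairwise_gcd` and `simulate` and the default parameter values are illustrative assumptions.

```python
import math
import random


def max_pairwise_gcd(sample):
    """Return the largest GCD over all unordered pairs of the sample."""
    best = 1
    for j in range(len(sample)):
        for k in range(j + 1, len(sample)):
            best = max(best, math.gcd(sample[j], sample[k]))
    return best


def simulate(n, alpha=1.0, seed=0):
    """Draw n integers uniformly from {1, ..., floor(e^(alpha*n))} and
    return log(max pairwise GCD) / log(n).

    math.exp overflows for alpha*n above roughly 709, so this sketch is
    only usable for moderate n; the upper limit is the floor of a float,
    which is an approximation of floor(e^(alpha*n)).
    """
    rng = random.Random(seed)
    upper = math.floor(math.exp(alpha * n))
    sample = [rng.randint(1, upper) for _ in range(n)]
    return math.log(max_pairwise_gcd(sample)) / math.log(n)


if __name__ == "__main__":
    for n in (50, 100, 200, 400):
        print(n, round(simulate(n), 3))
```

Convergence in (1) is slow because of the $\log[N]$ factor in (3), so ratios observed at small $N$ should be read only as a rough sanity check.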

If $T_1, \ldots, T_N$ were sampled uniformly without replacement from the integers
from 1 to $N^2$, the lower bound (3) would fail; see the analysis in Billingsley
(1999) of the distribution of the largest prime divisor of a random integer. In
the case of sampling from integers from 1 to $N^r$, where $r > 3$, the upper bound
(2) remains valid, but we do not know whether the lower bound (3) holds or
not.
1.3 Generalizations to Arithmetical Semigroups

Although details will not be given, the techniques used to prove Theorem 1.1
remain valid in the more general context of a commutative semigroup $G$ with identity
element 1, containing a countably infinite subset $\mathcal{P} := \{p_1, p_2, \ldots\}$ called the
primes of $G$, such that every element $a \ne 1$ of $G$ has a unique factorization of
the form

\[ a = \prod_{i \ge 1} p_i^{e_i}, \qquad (e_1, e_2, \ldots) \in \mathbb{Z}_+^{\mathbb{N}}, \]

where all but finitely many $e_i$ are zero. Assume in addition that $G$ is an
arithmetical semigroup in the sense of Knopfmacher (1990), meaning that there
exists a real-valued norm $|\cdot|$ on $G$ such that:

• $|1| = 1$, $|p_i| > 1$ for all $p_i \in \mathcal{P}$.

• $|ab| = |a||b|$ for all $a, b \in G$.

• The set $\pi_G[x] := \{i \ge 1 : |p_i| \le e^x\}$ is finite, for each real $x > 0$.

The only analytic condition needed is an abstract form of the Prime Number
Theorem (see Knopfmacher (1990), Chapter 6):

\[ \lim_{x \to \infty} x e^{-x} \left| \pi_G[x] \right| = 1, \]

used in the proof of Proposition 4.1. This in turn will imply convergence of series
such as:

\[ \sum_{p \in \mathcal{P}} \log\left[ 1 + |p|^{s-2} \right], \qquad s < 1, \]

which appear (in an exponentiated form) in the bound (2). For example, Landau's
Prime Ideal Theorem provides such a result in the case where $G$ is the set
of integral ideals in an algebraic number field, $\mathcal{P}$ is the set of prime ideals, and
$|a|$ is the norm of $a$. Knopfmacher (1990) also studies a more general setting
where, for some $\delta > 0$,

\[ \lim_{x \to \infty} x e^{-\delta x} \left| \pi_G[x] \right| = \delta. \]

The authors have not attempted to modify Theorem 1.1 to fit this case.

2. Pairwise Minima in a Geometric Probability Model

2.1 Geometric Random Vectors 

Let $\mathcal{P} := \{p_1, p_2, \ldots\}$ denote the rational primes $\{2, 3, 5, \ldots\}$ in increasing
order. Let $\mathcal{I}$ denote the set of non-negative integer vectors $(e_1, e_2, \ldots)$ for which
$\sum_i e_i < \infty$. Let $X_1, X_2, \ldots$ be (possibly dependent) non-negative integer random
variables, whose joint law has the property that, for every $k \in \mathbb{N}$, and every
$(e_1, e_2, \ldots) \in \mathcal{I}$ for which $e_k = 0$,

\[ \mathbb{P}\left[ X_k \ge m \,\Big|\, \bigcap_{i \ne k} \{X_i = e_i\} \right] \le \left( \frac{1}{p_k} \right)^m, \qquad m = 1, 2, \ldots. \tag{5} \]

Let $\zeta$ denote the random vector:

\[ \zeta := (X_1, X_2, \ldots) \in \mathbb{N}^{\mathbb{N}}. \tag{6} \]



Consider the finite-dimensional projections of $X_1, X_2, \ldots$ as a general model
for prime multiplicities in the prime factorization of a random integer, without
specifying exactly how that integer will be sampled. Let $\zeta^{(1)}, \ldots, \zeta^{(N)}$ be
independent random vectors, all having the same law as $\zeta$ in (6). Write $\zeta^{(k)}$
as $(X_1^k, X_2^k, \ldots)$. Then

\[ L_{j,k} := \sum_{i} \min\{X_i^j, X_i^k\} \log[p_i] \]

is a model for the log of the GCD of two such random integers. We shall now
derive an upper bound for

\[ \Delta_N := \max_{1 \le k < j \le N} \{L_{j,k}\}, \]

which models the log maximum of the pairwise GCD among a set of $N$ "large,
random" integers.
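To make the model concrete, the sketch below computes $\sum_i \min\{x_i, y_i\} \log[p_i]$ from the prime multiplicity vectors of two actual integers and compares it with the log of their GCD. The helper names `multiplicities` and `log_gcd_model` are illustrative, and the list of primes is assumed to contain every prime dividing either integer.

```python
import math


def multiplicities(n, primes):
    """Exponent of each prime in `primes` in the factorization of n."""
    exps = []
    for p in primes:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        exps.append(e)
    return exps


def log_gcd_model(x, y, primes):
    """Return sum_i min(x_i, y_i) * log(p_i) for multiplicity vectors x, y."""
    return sum(min(a, b) * math.log(p) for a, b, p in zip(x, y, primes))


if __name__ == "__main__":
    primes = [2, 3, 5, 7]
    a, b = 360, 84  # 360 = 2^3 * 3^2 * 5, 84 = 2^2 * 3 * 7
    model = log_gcd_model(multiplicities(a, primes), multiplicities(b, primes), primes)
    print(model, math.log(math.gcd(a, b)))
```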

2.2 Proposition 

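Assume the joint law of the components of $\zeta$ satisfies (5).
(i) For every $s \in (0, 1)$, the following expectation is finite:

\[ \mathbb{E}\left[ e^{s L_{j,k}} \right] \le \prod_{p \in \mathcal{P}} \left( 1 + \frac{p^s - 1}{p^2 - p^s} \right) =: C_s < \infty, \qquad s < 1. \tag{7} \]

(ii) For any $s \in (0, 1)$, and $b > C_s/2$, for $C_s$ as in (7), there is an upper
bound:

\[ \mathbb{P}\left[ \Delta_N \ge \log\left[ N^{2/s} \right] + s^{-1} \log[b] \right] \le \frac{C_s}{2b}. \tag{8} \]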



Proof: Consider first the case where $X_1, X_2, \ldots$ are independent Geometric
random variables, and

\[ \mathbb{P}[X_k \ge m] = \left( \frac{1}{p_k} \right)^m, \qquad m = 1, 2, \ldots. \]

It is elementary to check that, for $s \in (0, 1)$, and any $p \in \mathcal{P}$, if $X''$, $X'$ are
independent Geometric random variables with

\[ \mathbb{P}[X'' \ge m] = p^{-m} = \mathbb{P}[X' \ge m], \qquad m = 1, 2, \ldots, \]

then their minimum is also a Geometric random variable, with
$\mathbb{P}[\min\{X'', X'\} \ge m] = p^{-2m}$, which satisfies

\[ \mathbb{E}\left[ p^{s \min\{X'', X'\}} \right] = \frac{p^2 - 1}{p^2 - p^s}. \]

It follows from the independence assumption that

\[ \mathbb{E}\left[ e^{s L_{k,j}} \right] = \mathbb{E}\left[ \prod_i p_i^{s \min\{X_i^k, X_i^j\}} \right] = \prod_i \frac{p_i^2 - 1}{p_i^2 - p_i^s} = C_s. \]

This verifies the assertion (7). Markov's inequality shows that, for any $s \in (0, 1)$,

\[ C_s \ge e^{st}\, \mathbb{P}[L_{k,j} \ge t]. \]



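Furthermore

\[ \mathbb{P}\left[ \max_{1 \le k < j \le N} \{L_{k,j}\} \ge t \right] = \mathbb{P}\left[ \bigcup_{1 \le k < j \le N} \{L_{k,j} \ge t\} \right] \le \sum_{1 \le k < j \le N} \mathbb{P}[L_{k,j} \ge t]. \]

It follows that, for $s \in (0, 1)$, $b > 0$, and $t := s^{-1} \log[b N^2]$,

\[ \mathbb{P}\left[ \Delta_N \ge \log\left[ N^{2/s} \right] + s^{-1} \log[b] \right] \le \frac{N^2}{2}\, C_s e^{-st} = \frac{C_s}{2b}. \]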



It remains to consider the case where $X_1, X_2, \ldots$ satisfies (5), without the
independence assumption. Choose a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ on which
independent Geometric random variables $X_1', X_2', \ldots$ and $X_1'', X_2'', \ldots$ are defined,
such that for all $i \ge 1$,

\[ \mathbb{P}[X_i'' \ge m] = p_i^{-m} = \mathbb{P}[X_i' \ge m], \qquad m = 1, 2, \ldots. \]

We propose to construct $\zeta^{(1)} = (X_1^1, X_2^1, \ldots)$ and $\zeta^{(2)} = (X_1^2, X_2^2, \ldots)$
by induction, on this probability space $(\Omega, \mathcal{F}, \mathbb{P})$, so that for each $n \ge 1$,
$\{(X_i^1, X_i^2), 1 \le i \le n\}$ have the correct joint law, and

\[ X_i^1 \le X_i', \qquad X_i^2 \le X_i'', \qquad i = 1, 2, \ldots. \]

Once this is achieved, monotonicity implies

\[ \mathbb{E}\left[ \prod_i p_i^{s \min\{X_i^1, X_i^2\}} \right] \le \mathbb{E}\left[ \prod_i p_i^{s \min\{X_i', X_i''\}} \right], \]

so the desired result will follow from the previous one for independent Geometric
random variables.
Since $\zeta^{(1)}$ and $\zeta^{(2)}$ are independent, it suffices to construct $\zeta^{(1)}$ in terms of
$X_1', X_2', \ldots$ so that $X_i^1 \le X_i'$ for all $i$. Let $(U_{i,j}, i \ge 1, j \ge 0)$ be independent
Uniform(0, 1) random variables. Suppose either $i = 1$, or else some values $X_1^1 =
e_1, X_2^1 = e_2, \ldots, X_{i-1}^1 = e_{i-1}$ have already been determined. By assumption,
there exist parameters

\[ q_{i,k} := \mathbb{P}\left[ X_i \ge k \,\Big|\, \bigcap_{j < i} \{X_j = e_j\} \right] \le \left( \frac{1}{p_i} \right)^k. \]

Use these to construct $X_i'$ and $X_i^1$ as follows:

\[ X_i' := \min\left\{ k : U_{i,0} U_{i,1} \cdots U_{i,k} > \left( \frac{1}{p_i} \right)^k \right\}; \]

\[ X_i^1 := \min\left\{ k : U_{i,0} U_{i,1} \cdots U_{i,k} > q_{i,k} \right\} \le X_i'. \]
This completes the construction and the proof, giving the result (8). □ 
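The closed form for the exponential moment of the minimum, used in the proof above, can be checked numerically. The sketch below sums the series $\sum_{m \ge 0} p^{sm}\, \mathbb{P}[\min = m]$ with $\mathbb{P}[\min \ge m] = p^{-2m}$ and compares it with $(p^2 - 1)/(p^2 - p^s)$; it also evaluates the bound $C_s/(2b)$ of (8) for a product truncated to a finite list of primes. The function names and the truncation are assumptions made for illustration; a truncated product is only a lower approximation to $C_s$, and the full product converges slowly when $s$ is close to 1.

```python
import math


def moment_series(p, s, terms=400):
    """Sum_{m>=0} p^(s*m) * P[min = m], where P[min >= m] = p^(-2m)."""
    ratio = p ** (s - 2.0)  # p^s * p^(-2), lies in (0, 1) when s < 2
    return (1.0 - p ** -2.0) * sum(ratio ** m for m in range(terms))


def moment_closed_form(p, s):
    """(p^2 - 1) / (p^2 - p^s), the factor appearing in (7)."""
    return (p * p - 1.0) / (p * p - p ** s)


def truncated_C(s, primes):
    """Product of the closed-form factors over a finite list of primes."""
    return math.prod(moment_closed_form(p, s) for p in primes)


def tail_bound(s, b, primes):
    """Right side of (8), C_s / (2b), with C_s replaced by a truncated product."""
    return truncated_C(s, primes) / (2.0 * b)


if __name__ == "__main__":
    for p in (2, 3, 5):
        print(p, moment_series(p, 0.5), moment_closed_form(p, 0.5))
```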

3. Lower Bound for Largest Collision 

3.1 Random Vectors with Independent Components 

Let $\mathcal{P} := \{p_1, p_2, \ldots\}$ denote the rational primes $\{2, 3, 5, \ldots\}$ in increasing
order, and let $\alpha_j := (\log[p_j])^{1/2}$. Instead of the Geometric model (5), switch
to a Bernoulli model in which $Z_1, Z_2, \ldots$ are independent Bernoulli random
variables, with

\[ \mathbb{P}[Z_j = 1] := \frac{1}{p_j}. \tag{9} \]

Let $\xi$ denote the random vector

\[ \xi := (\alpha_1 Z_1, \alpha_2 Z_2, \ldots) \in [0, \infty)^{\mathbb{N}} \tag{10} \]

under this new assumption, and let $\xi^{(1)}, \xi^{(2)}, \ldots, \xi^{(N)}$ be independent random
vectors, all having the same law as $\xi$. Note that $\xi^{(1)} \cdot \xi^{(2)}$ is not a suitable model
for the GCD of two random integers, because the independence assumption (9)
is not realistic. However it is a useful context to develop the techniques which
will establish the lower bound in Theorem 1.1.

Write $\xi^{(k)} = (\alpha_1 Z_1^k, \alpha_2 Z_2^k, \ldots)$. We seek a lower bound on the log of the
largest prime $p_i$ at which a "collision" occurs, meaning that $Z_i^k = 1 = Z_i^j$ for
some $j, k$:

\[ \Delta'_N := \max_{1 \le k < j \le N} \left\{ \max_i \left\{ Z_i^k Z_i^j \log[p_i] \right\} \right\} \le \max_{1 \le k < j \le N} \left\{ \xi^{(k)} \cdot \xi^{(j)} \right\}. \]



3.2 Proposition 

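Given $\delta \in (0, \infty)$, define $\varphi_N := \varphi_N[\delta]$ implicitly by the identity

\[ \int_{\varphi_N}^{2\varphi_N} \frac{N^2\, dx}{2 x^2 \log[x]} = \delta. \tag{11} \]

Under the assumption of independence of the components of the random vector
(10),

\[ \lim_{N \to \infty} \mathbb{P}\left[ \Delta'_N \ge \log[\varphi_N[\delta]] \right] \ge 1 - e^{-\delta}. \tag{12} \]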



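Remark: From the integration bounds:

\[ \frac{1}{2\varphi_N \log[2\varphi_N]} = \frac{1}{\log[2\varphi_N]} \int_{\varphi_N}^{2\varphi_N} \frac{dx}{x^2} < \int_{\varphi_N}^{2\varphi_N} \frac{dx}{x^2 \log[x]} = \frac{2\delta}{N^2} < \frac{1}{\log[\varphi_N]} \int_{\varphi_N}^{2\varphi_N} \frac{dx}{x^2} = \frac{1}{2\varphi_N \log[\varphi_N]}, \]

it follows that $\varphi_N$, defined in (11), satisfies $\varphi_N \log[\varphi_N] / N^2 \to 0.25/\delta$. Hence
for all sufficiently large $N$, $\varphi_N < N^2/2$, and

\[ \varphi_N > \frac{N^2}{4\delta \log[2\varphi_N]} > \frac{N^2}{8\delta \log[N]}. \tag{13} \]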
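Since $\varphi_N$ is defined only implicitly by (11), a numerical solution can help build intuition for the bounds in the Remark. The sketch below approximates the integral with a midpoint rule and solves for $\varphi_N$ by bisection, using the fact that the integral is decreasing in its lower limit. The function names, the step count, and the bisection bracket $[2, N^2]$ are assumptions for illustration; the bracket is adequate only when $\delta$ is not too small relative to $1/\log[N]$.

```python
import math


def integral(phi, n, steps=2000):
    """Midpoint-rule value of the integral in (11):
    int_{phi}^{2 phi} n^2 / (2 x^2 log x) dx."""
    h = phi / steps
    total = 0.0
    for i in range(steps):
        x = phi + (i + 0.5) * h
        total += n * n / (2.0 * x * x * math.log(x))
    return total * h


def solve_phi(n, delta, lo=2.0, hi=None):
    """Bisection for phi such that integral(phi, n) == delta.

    Assumes integral(lo, n) > delta > integral(hi, n); the integral is
    decreasing in phi for phi > 1.
    """
    hi = float(n * n) if hi is None else hi
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if integral(mid, n) > delta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    n, delta = 1000, 1.0
    phi = solve_phi(n, delta)
    print(phi, phi * math.log(phi) / n**2, 0.25 / delta)
```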

The proof uses the following technical Lemma, which the reader may treat 
as a warm-up exercise for the more difficult Proposition 4.1. 



3.3 Lemma 

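Let $\mathcal{P}_N$ denote the set of primes $p$ such that $\varphi_N < p \le 2\varphi_N$. Let
$\{Z_p^k, p \in \mathcal{P}_N, 1 \le k \le N\}$ be independent Bernoulli random variables, where
$\mathbb{P}[Z_p^k = 1] = 1/p$. Take $D_p := Z_p^1 + \cdots + Z_p^N$. Then

\[ \lim_{N \to \infty} \mathbb{P}\left[ \bigcup_{p \in \mathcal{P}_N} \{D_p \ge 2\} \right] = 1 - e^{-\delta}. \tag{14} \]

Proof: Binomial probabilities give: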



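\[ \mathbb{P}[D_p \le 1] = \left( 1 - \frac{1}{p} \right)^N + \frac{N}{p} \left( 1 - \frac{1}{p} \right)^{N-1} = 1 - \frac{N(N-1)}{2p^2} + O\left( \left( \frac{N}{p} \right)^3 \right) = 1 - \frac{N^2}{2p^2} + O\left( \frac{N}{\varphi_N^2} \right) + O\left( \left( \frac{N}{\varphi_N} \right)^3 \right). \]

Independence of $\{Z_p^k, p \in \mathcal{P}_N, 1 \le k \le N\}$ implies independence of $\{D_p, p \in \mathcal{P}_N\}$,
so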
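\[ \log \mathbb{P}\left[ \bigcap_{p \in \mathcal{P}_N} \{D_p \le 1\} \right] = \sum_{p \in \mathcal{P}_N} \log\left[ \mathbb{P}[D_p \le 1] \right] = \sum_{p \in \mathcal{P}_N} \log\left[ 1 - \frac{N^2}{2p^2} \right] + O\left( \frac{N |\mathcal{P}_N|}{\varphi_N^2} \right) + O\left( \frac{N^3 |\mathcal{P}_N|}{\varphi_N^3} \right). \]

Using the estimates $\varphi_N \log[\varphi_N] = O(N^2)$, $|\mathcal{P}_N| = O(\varphi_N / \log[\varphi_N])$, and
$p/\varphi_N \le 2$, the last expression becomes

\[ -\sum_{p \in \mathcal{P}_N} \frac{N^2}{2p^2} \left( 1 + O\left( \frac{N^2}{\varphi_N^2} \right) \right) + O\left( \frac{N}{\varphi_N \log[\varphi_N]} \right) + O\left( \frac{N^3}{\varphi_N^2 \log[\varphi_N]} \right). \]

All terms but the first vanish in the limit, while the Prime Number Theorem
ensures that

\[ \lim_{N \to \infty} \sum_{p \in \mathcal{P}_N} \frac{N^2}{2p^2} = \delta. \]

Therefore

\[ \lim_{N \to \infty} \mathbb{P}\left[ \bigcap_{p \in \mathcal{P}_N} \{D_p \le 1\} \right] = e^{-\delta}. \]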



Thus the limit (14) follows. □ 
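The expansion used in this proof can be checked numerically for a fixed $N$. The sketch below takes the primes in an interval $(\varphi, 2\varphi]$, computes $-\log \mathbb{P}[\bigcap_p \{D_p \le 1\}]$ exactly from the Binomial formula under the independence assumption, and compares it with the leading term $\sum_p N^2/(2p^2)$. The function names and the sample values $N = 1000$, $\varphi = 25000$ are illustrative assumptions; $\varphi$ is not solved from (11) here.

```python
import math


def primes_in(lo, hi):
    """Primes p with lo < p <= hi, by a sieve of Eratosthenes."""
    sieve = bytearray([1]) * (hi + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(hi) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, hi + 1, i)))
    return [p for p in range(lo + 1, hi + 1) if sieve[p]]


def log_prob_no_collision(n, primes):
    """log P[D_p <= 1 for all p], with independent D_p ~ Binomial(n, 1/p)."""
    total = 0.0
    for p in primes:
        q = 1.0 - 1.0 / p
        total += math.log(q ** n + (n / p) * q ** (n - 1))
    return total


def leading_term(n, primes):
    """Sum over p of n^2 / (2 p^2), the first-order term in the proof."""
    return sum(n * n / (2.0 * p * p) for p in primes)


if __name__ == "__main__":
    n, phi = 1000, 25000
    ps = primes_in(phi, 2 * phi)
    print(-log_prob_no_collision(n, ps), leading_term(n, ps))
```

The two printed quantities differ by the lower-order terms of the expansion, which shrink as $N/\varphi_N$ decreases.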



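3.3.1 Proof of Proposition 3.2

According to our model, if $D_p \ge 2$ for some $p = p_i \in \mathcal{P}_N$, then there are indices
$1 \le k < j \le N$ for which $Z_i^j = 1 = Z_i^k$. Since $\log[p_i] \ge \log[\varphi_N[\delta]]$,

\[ \lim_{N \to \infty} \mathbb{P}\left[ \Delta'_N \ge \log[\varphi_N[\delta]] \right] \ge \lim_{N \to \infty} \mathbb{P}\left[ \bigcup_{p \in \mathcal{P}_N} \{D_p \ge 2\} \right] = 1 - e^{-\delta}. \]

This verifies (12). □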



4. Application: Pairwise GCDs of Many Uniform Random Integers

We shall now prove an analogue of Lemma 3.3 which applies to random integers,
dropping the independence assumption for the components of the random
vector (10).

4.1 Proposition

Suppose $\alpha > 0$, and $T_1, \ldots, T_N$ is a random sample, drawn with replacement,
from the integers $\{n \in \mathbb{N} : n \le e^{\alpha N}\}$. Given $\delta \in (0, \infty)$, define $\varphi_N := \varphi_N[\delta]$
implicitly by the identity (11). Let $\mathcal{P}_N$ denote the set of primes $p$ such
that $\varphi_N < p \le 2\varphi_N$. For $p \in \mathcal{P}_N$ let $D_p$ denote the number of elements of
$\{T_1, \ldots, T_N\}$ which are divisible by $p$. Then

\[ \lim_{N \to \infty} \mathbb{P}\left[ \bigcup_{p \in \mathcal{P}_N} \{D_p \ge 2\} \right] = 1 - e^{-\delta}. \tag{15} \]

Proof: As noted above, the Prime Number Theorem ensures that

\[ \lim_{N \to \infty} \sum_{p \in \mathcal{P}_N} \frac{N^2}{2p^2} = \delta. \]

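More generally, the alternating series for the exponential function ensures that,
given $\epsilon \in (0, 1)$, there is an even integer $d \ge 1$ such that, for all sufficiently large
$N$,

\[ 1 - e^{-\delta/(1+\epsilon)} < \sum_{r=1}^{d} (-1)^{r+1} I_r < 1 - e^{-\delta/(1-\epsilon)}, \]

where the sum defining $I_r$ runs over subsets $\{p_1, \ldots, p_r\} \subset \mathcal{P}_N$:

\[ I_r := \sum_{p_1 < \cdots < p_r} \frac{N^{2r}}{2^r (p_1 \cdots p_r)^2}, \qquad r = 1, 2, \ldots, d. \]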
p 1 <...<p r 



Because $\varphi_N / N^2 \to 0$, it follows that, for every $\{p_1, \ldots, p_d\} \subset \mathcal{P}_N$,

\[ \frac{p_1 \cdots p_d}{e^{\alpha N}} \le \frac{(2\varphi_N)^d}{e^{\alpha N}} \le e^{2d \log[N] - \alpha N} \to 0. \]

Suppose that, for this constant value of $d$, we fix some $\{p_1, \ldots, p_d\} \subset \mathcal{P}_N$;
instead of sampling $T_1, \ldots, T_N$ uniformly from integers up to $e^{\alpha N}$, sample
$T_1', \ldots, T_N'$ uniformly from integers up to

\[ p_1 \cdots p_d \left\lfloor e^{\alpha N} / (p_1 \cdots p_d) \right\rfloor. \]

From symmetry considerations, the Bernoulli random variables $B_1', \ldots, B_d'$
are independent, with parameters $1/p_1, \ldots, 1/p_d$, respectively, where $B_i'$ is the
indicator of the event that $p_i$ divides $T_1'$. By elementary reasoning,

\[ \mathbb{P}[D_p \ge 2] = \frac{N^2}{2p^2} + O\left( (N/\varphi_N)^3 \right); \]

\[ \mathbb{P}[D_{p_1} \ge 2, \ldots, D_{p_r} \ge 2] = \frac{N^{2r}}{2^r (p_1 \cdots p_r)^2} + O\left( (N/\varphi_N)^{2r+1} \right), \qquad r = 1, 2, \ldots, d. \]
If we were to sample $T_1, \ldots, T_N$ instead of $T_1', \ldots, T_N'$, the most that such a
probability could change is

\[ \mathbb{P}\left[ \bigcup_{i=1}^{N} \{T_i \ne T_i'\} \right] \le \frac{N p_1 \cdots p_d}{e^{\alpha N}} \le e^{(2d+1) \log[N] - \alpha N}. \]



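The same estimate holds for any choice of $\{p_1, \ldots, p_d\} \subset \mathcal{P}_N$. By the
inclusion-exclusion formula, taken to the first $d$ terms,

\[ \mathbb{P}\left[ \bigcup_{p \in \mathcal{P}_N} \{D_p \ge 2\} \right] \ge \sum_{p \in \mathcal{P}_N} \mathbb{P}[D_p \ge 2] - \sum_{p_1 < p_2} \mathbb{P}[D_{p_1} \ge 2, D_{p_2} \ge 2] + \cdots - \sum_{p_1 < \cdots < p_d} \mathbb{P}[D_{p_1} \ge 2, \ldots, D_{p_d} \ge 2] \]

\[ = \sum_{r=1}^{d} (-1)^{r+1} I_r + O\left( (N/\varphi_N)^3 \right) + O\left( |\mathcal{P}_N|^d\, e^{(2d+1) \log[N] - \alpha N} \right). \]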



So under this simplified model, the reasoning above combines to show that,
for all sufficiently large $N$,

\[ 1 - e^{-\delta/(1+\epsilon)} < \mathbb{P}\left[ \bigcup_{p \in \mathcal{P}_N} \{D_p \ge 2\} \right] < 1 - e^{-\delta/(1-\epsilon)}. \]

Since $\epsilon$ can be made arbitrarily small, this verifies the result. □

4.2 Proof of Theorem 1.1 

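Suppose $\alpha > 0$, and $T_1, \ldots, T_N$ is a random sample, drawn with replacement,
from the integers $\{n \in \mathbb{N} : n \le e^{\alpha N}\}$. Let $\Lambda_{j,k}$ denote the largest common prime
factor of $T_j$ and $T_k$. Take

\[ \Delta'_N := \max_{1 \le k < j \le N} \{\log[\Lambda_{j,k}]\}. \]

In the language of Proposition 4.1, if $D_p \ge 2$ for some $p \in \mathcal{P}_N$, then there
are indices $1 \le k < j \le N$ for which $\Lambda_{j,k} > \varphi_N$. So inequality (13) and
Proposition 4.1 imply that, for any $\theta = 8\delta > 0$,

\[ \lim_{N \to \infty} \mathbb{P}\left[ \Delta'_N \ge 2 \log[N] - \log\left[ \log\left[ N^{\theta} \right] \right] \right] \ge \lim_{N \to \infty} \mathbb{P}\left[ \Delta'_N \ge \log[\varphi_N[\theta/8]] \right] \ge \lim_{N \to \infty} \mathbb{P}\left[ \bigcup_{p \in \mathcal{P}_N} \{D_p \ge 2\} \right] = 1 - e^{-\theta/8}. \]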



This is precisely the lower bound (3). For any $\eta > 0$, the lower bound in (1)
follows from:

\[ \lim_{N \to \infty} \mathbb{P}\left[ \Delta'_N > (2 - \eta) \log[N] \right] = 1. \]
Let $\Gamma_{j,k} \ge \Lambda_{j,k}$ denote the Greatest Common Divisor of $T_j$ and $T_k$. To
obtain the upper bound (2) on $\Gamma_{j,k}$, it suffices by Proposition 2.2 to check
that condition (5) is valid, when $X_i$ denotes the multiplicity to which prime $p_i$
divides $T_1$. Take any positive integer $r \ge 1$, any prime $p_k$ coprime to $r$, and any
$m \ge 1$. The conditional probability that $p_k^m$ divides $T_1$, given that $r$ divides $T_1$,
is

\[ \frac{\left\lfloor e^{\alpha N} / (r p_k^m) \right\rfloor}{\left\lfloor e^{\alpha N} / r \right\rfloor} \le \left( \frac{1}{p_k} \right)^m. \]

So condition (5) holds. Thus (8) holds, which is equivalent to (2).
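The floor inequality above can be confirmed by brute force over small cases. The sketch below checks it in the cross-multiplied integer form $\lfloor M/(r p^m) \rfloor \, p^m \le \lfloor M/r \rfloor$, where `upper` plays the role of $\lfloor e^{\alpha N} \rfloor$; the function name and the ranges tested are illustrative assumptions.

```python
def conditional_prob_bound_holds(upper, r, p, m):
    """Check floor(upper/(r*p^m)) / floor(upper/r) <= p^(-m) using exact
    integer arithmetic, in cross-multiplied form (valid even when the
    denominator floor(upper/r) is zero)."""
    multiples_of_r = upper // r                # count of n <= upper with r | n
    multiples_of_r_pm = upper // (r * p ** m)  # count with r * p^m | n
    return multiples_of_r_pm * p ** m <= multiples_of_r


if __name__ == "__main__":
    ok = all(
        conditional_prob_bound_holds(upper, r, p, m)
        for upper in range(1, 2000)
        for r in (1, 3, 5, 9)   # each coprime to the primes below
        for p in (2, 7)
        for m in (1, 2, 3)
    )
    print(ok)
```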

Finally we derive the upper bound in (1), for an arbitrary $\eta > 0$. Fix
$\epsilon \in (0, 1)$ and $\eta > 0$. Select $s \in (0, 1)$ to satisfy $2/s = 2 + \eta/2$. Then choose
$b = C_s / \epsilon$. According to (8),

\[ \mathbb{P}\left[ \Delta_N \ge (2 + \eta/2) \log[N] + s^{-1} \log[b] \right] \le \epsilon/2. \]

For any $N$ sufficiently large so that $(\eta/2) \log[N] > s^{-1} \log[b]$,

\[ \mathbb{P}\left[ \Delta_N \ge (2 + \eta) \log[N] \right] \le \epsilon/2. \]

This yields the desired bound (1). □